Measuring efficiency in high-accuracy, broad-coverage statistical parsing
Very little attention has been paid to comparing efficiency between
high-accuracy statistical parsers. This paper proposes one machine-independent
metric that is general enough to allow comparisons across very different
parsing architectures. This metric, which we call "events considered",
measures the number of "events", however they are defined for a particular
parser, for which a probability must be calculated in order to find the parse.
It is applicable to single-pass or multi-stage parsers. We discuss the
advantages of the metric and demonstrate its usefulness by using it to compare
two parsers that differ in several fundamental ways.
Comment: 8 pages, 4 figures, 2 tables
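The "events considered" metric can be illustrated with a minimal sketch: wrap every probability evaluation in a counter, so that whatever a parser treats as an "event", each scored event increments the tally. The names below (`EventCounter`, `parse_with_counter`, the toy chart items) are hypothetical illustrations, not from the paper.

```python
# Hypothetical sketch of counting "events considered": every probability
# the parser evaluates is one event, regardless of parser architecture.

class EventCounter:
    """Counts probability evaluations ("events considered")."""

    def __init__(self):
        self.count = 0

    def score(self, probability):
        # Each call corresponds to one event whose probability is computed.
        self.count += 1
        return probability


def parse_with_counter(chart_items, counter):
    # Stand-in for a parser's inner loop: every scored item is one event.
    best = None
    for item, prob in chart_items:
        p = counter.score(prob)
        if best is None or p > best[1]:
            best = (item, p)
    return best


counter = EventCounter()
items = [("NP -> DT NN", 0.4), ("NP -> NNP", 0.3), ("NP -> PRP", 0.3)]
best = parse_with_counter(items, counter)
print(best, counter.count)  # 3 events considered
```

Because the counter only observes probability calls, the same wrapper applies to single-pass and multi-stage parsers alike, which is what makes the metric architecture-neutral.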
Explaining vowel inventory tendencies via simulation: finding a role for quantal locations and formant normalization
Disambiguatory Signals are Stronger in Word-initial Positions
Psycholinguistic studies of human word processing and lexical access provide
ample evidence of the preferred nature of word-initial versus word-final
segments, e.g., in terms of attention paid by listeners (greater) or the
likelihood of reduction by speakers (lower). This has led to the conjecture --
as in Wedel et al. (2019b), but common elsewhere -- that languages have evolved
to provide more information earlier in words than later. Information-theoretic
methods to establish such tendencies in lexicons have suffered from several
methodological shortcomings that leave open the question of whether this high
word-initial informativeness is actually a property of the lexicon or simply an
artefact of the incremental nature of recognition. In this paper, we point out
the confounds in existing methods for comparing the informativeness of segments
early in the word versus later in the word, and present several new measures
that avoid these confounds. When controlling for these confounds, we still find
evidence across hundreds of languages that indeed there is a cross-linguistic
tendency to front-load information in words.
Comment: Accepted at EACL 2021. Code is available at
https://github.com/tpimentelms/frontload-disambiguatio
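One simple information-theoretic quantity of the kind discussed above is the average surprisal of the segment at each position, conditioned on the word-initial prefix. The sketch below is an illustrative toy version of such a measure (uniform word frequencies, a five-word lexicon), not the paper's exact confound-controlled measures.

```python
# Toy positional-surprisal sketch: for each segment position, average
# -log2 P(segment | word-initial prefix) over the lexicon. Assumes
# uniform word frequencies; illustrative only.
from collections import defaultdict
from math import log2


def positional_surprisal(lexicon):
    # Count, for every prefix, how often each segment continues it.
    continuations = defaultdict(lambda: defaultdict(int))
    for word in lexicon:
        for i, seg in enumerate(word):
            continuations[word[:i]][seg] += 1

    # Average surprisal of the observed segment at each position.
    by_position = defaultdict(list)
    for word in lexicon:
        for i, seg in enumerate(word):
            counts = continuations[word[:i]]
            p = counts[seg] / sum(counts.values())
            by_position[i].append(-log2(p))
    return {i: sum(v) / len(v) for i, v in by_position.items()}


lexicon = ["cat", "cab", "can", "dog", "dot"]
print(positional_surprisal(lexicon))
```

Note that in a tiny lexicon like this the profile need not decrease with position; the cross-linguistic front-loading tendency the paper reports emerges over full lexicons, and only once the recognition-order confounds it describes are controlled for.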